History-dependent Evaluations in Partially Observable Markov Decision Process

نویسندگان

چکیده

Related DatabasesWeb of Science You must be logged in with an active subscription to view this.Article DataHistorySubmitted: 20 April 2020Accepted: 03 February 2021Published online: 29 2021KeywordsMarkov decision process, partial observation, long-run average payoffAMS Subject Headings90C39, 90C40, 37A50, 60J20Publication DataISSN (print): 0363-0129ISSN (online): 1095-7138Publisher: Society for Industrial and Applied MathematicsCODEN: sjcodc

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Robust partially observable Markov decision process

We seek to find the robust policy that maximizes the expected cumulative reward for the worst case when a partially observable Markov decision process (POMDP) has uncertain parameters whose values are only known to be in a given region. We prove that the robust value function, which represents the expected cumulative reward that can be obtained with the robust policy, is convex with respect to ...

متن کامل

The Infinite Partially Observable Markov Decision Process

The Partially Observable Markov Decision Process (POMDP) framework has proven useful in planning domains where agents must balance actions that provide knowledge and actions that provide reward. Unfortunately, most POMDPs are complex structures with a large number of parameters. In many real-world problems, both the structure and the parameters are difficult to specify from domain knowledge alo...

متن کامل

Partially observable Markov decision processes

For reinforcement learning in environments in which an agent has access to a reliable state signal, methods based on the Markov decision process (MDP) have had many successes. In many problem domains, however, an agent suffers from limited sensing capabilities that preclude it from recovering a Markovian state signal from its perceptions. Extending the MDP framework, partially observable Markov...

متن کامل

Partially Observable Markov Decision Process Approximations for Adaptive Sensing

Adaptive sensing involves actively managing sensor resources to achieve a sensing task, such as object detection, classification, and tracking, and represents a promising direction for new applications of discrete event system methods. We describe an approach to adaptive sensing based on approximately solving a partially observable Markov decision process (POMDP) formulation of the problem. Suc...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Siam Journal on Control and Optimization

سال: 2021

ISSN: ['0363-0129', '1095-7138']

DOI: https://doi.org/10.1137/20m1332876